Endpoint detection using weighted finite state transducer
Abstract
In this paper, we discuss the possibility of applying a weighted finite state transducer (WFST) as a unified framework for solving the endpoint detection problem. In general, endpoint detection is composed of two cascaded decision processes. The first is voice activity detection (VAD), which makes a frame-level speech/non-speech classification. The second is utterance-level detection, which makes the final decision using state transition control and heuristic knowledge. Recently, statistical model-based approaches have become common for VAD, while rule-based logic remains dominant for utterance-level detection. However, such a split approach can cause problems. First, it requires expert knowledge to define the rules and a sophisticated implementation to avoid conflicts among them. Second, it can yield suboptimal performance because each process has to be handled independently. Therefore, to address these problems by integrating the two processes, we propose a WFST-based endpoint detection framework. The experimental results show that the endpoint detection problem can be solved in a straightforward way under the proposed framework.
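To make the idea of integrating the two decision processes concrete, the following is a minimal sketch, not the authors' actual transducer. It assumes a hypothetical three-state topology (leading silence, speech, trailing silence), hypothetical transition costs of 1.0, and frame-level speech probabilities supplied by some VAD model. Frame costs and transition costs are added along each path and the cheapest path is kept, which corresponds to a shortest-path search in the tropical semiring. It is written in plain Python and does not use any WFST library.

```python
import math

# Hypothetical endpoint transducer, one arc per tuple:
# (source state, frame label, destination state, output symbol, cost).
# States: 0 = leading silence, 1 = speech, 2 = trailing silence.
# Frame labels: "S" = speech, "N" = non-speech.
ARCS = [
    (0, "N", 0, None, 0.0),
    (0, "S", 1, "<begin>", 1.0),  # assumed penalty for starting an utterance
    (1, "S", 1, None, 0.0),
    (1, "N", 2, "<end>", 1.0),    # assumed penalty for ending an utterance
    (2, "N", 2, None, 0.0),
]
FINAL_STATES = {0, 2}  # state 0 final: no utterance was found


def detect_endpoints(speech_probs):
    """Return [(symbol, frame index), ...] along the lowest-cost path."""
    # best[state] = (accumulated cost, outputs emitted so far)
    best = {0: (0.0, [])}
    for t, p in enumerate(speech_probs):
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep the logarithms finite
        frame_cost = {"S": -math.log(p), "N": -math.log(1.0 - p)}
        new_best = {}
        for src, label, dst, out, weight in ARCS:
            if src not in best:
                continue
            cost, outputs = best[src]
            cost = cost + weight + frame_cost[label]
            if dst not in new_best or cost < new_best[dst][0]:
                emitted = outputs + [(out, t)] if out is not None else outputs
                new_best[dst] = (cost, emitted)
        best = new_best
    finals = [best[s] for s in FINAL_STATES if s in best]
    return min(finals, key=lambda entry: entry[0])[1]


probs = [0.1, 0.1, 0.9, 0.9, 0.2, 0.9, 0.9, 0.1, 0.1]
print(detect_endpoints(probs))  # [('<begin>', 2), ('<end>', 7)]
```

In this toy input, the single low-probability frame at index 4 does not end the utterance, because leaving the speech state there would force the two following high-probability frames to be labelled as non-speech at a much higher total cost. The hangover behaviour that a rule-based detector would encode explicitly thus falls out of the path search. The reported end index is the first frame labelled as trailing non-speech.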
Similar resources
Flexible Speech Synthesis Using Weighted Finite State Transducers
A phrase-level machine translation approach for disfluency detection using weighted finite state transducers
We propose a novel algorithm to detect disfluency in speech by reformulating the problem as phrase-level statistical machine translation using weighted finite state transducers. We approach the task as translation of noisy speech to clean speech. We simplify our translation framework such that it does not require fertility and alignment models. We tested our model on the Switchboard disfluency-...
OpenFst: An Open-Source, Weighted Finite-State Transducer Library and its Applications to Speech and Language
Finite-state methods are well established in language and speech processing. OpenFst (available from www.openfst.org) is a free and open-source software library for building and using finite automata, in particular, weighted finite-state transducers (FSTs). This tutorial is an introduction to weighted finite-state transducers and their uses in speech and language processing. While there are othe...
Efficient Morphological Parsing with a Weighted Finite State Transducer
This article describes a highly optimized algorithm and implementation of a deterministic weighted finite state transducer for morphological analysis. We show how various functionalities can be integrated into one machine, without sacrificing performance or flexibility, and still maintaining applicability to various languages. The annotation schema used in this implementation maximizes inte...
Dysarthric Speech Recognition Based on Error-Correction in a Weighted Finite State Transducer Framework
In this paper, a dysarthric speech recognition error-correction method in a weighted finite state transducer (WFST) framework is proposed to improve the performance of dysarthric automatic speech recognition (ASR). To this end, pronunciation variation models are constructed from a context-dependent confusion matrix based on a weighted Kullback-Leibler (KL) distance between triphones. Then, a WF...